Pseudocode should be:
Pseudocoding is an iterative process.
Write a function is_TRUE() which inputs a vector x and, for each element in x, outputs TRUE if and only if that element of x is TRUE and outputs FALSE otherwise.
This, as is, is already a very good plain-English description of what is to be done. There is, though, some ambiguity of language which might prove problematic were we to simply code it directly as it is written. While it may be understood we need to return a vector, it isn’t explicitly stated.
In fact, taken literally, it says to return a value of TRUE, for each element in our input vector, but a function can only return once and can only return one object. So, a Maliciously Compliant Actor (MCA) might write a function that looks something like this:
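A minimal sketch of what such a function might look like in R. The name is_TRUE_mca is hypothetical, chosen here only for illustration, and the body is one possible hyper-literal reading of the description rather than a definitive listing.

```r
# Hypothetical hyper-literal reading of the description:
# "for each element in x, output TRUE if ... and FALSE otherwise".
is_TRUE_mca <- function(x) {
  for (e in x) {
    if (identical(e, TRUE)) {
      return(TRUE)
    } else {
      return(FALSE)
    }
  }
}

# Try it on a vector with more than one element.
result <- is_TRUE_mca(c(TRUE, FALSE, TRUE))
length(result)
```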
Do you see the problem?
We will exit the function in the first iteration every time.
When communicating among people we (generally) aren’t assuming they are going to take our words hyper-literally or deliberately interpret them in the worst possible way. We don’t have that same luxury when we are dealing with computers, so our first step might (should) be to eliminate as much ambiguity as possible without making it overly complicated or ponderous.
We can do that by explicitly stating we will be returning a vector and being more clear about what is being set to TRUE or FALSE (the elements of the return vector).
The is_TRUE() function inputs a vector x and returns a vector where, for each element in x, the corresponding element of the return vector is TRUE if and only if the element in x is TRUE, and FALSE otherwise.
For the very first pass all we should do is format our sentence into a pseudocode structure. List the function name, what goes in, what comes out, and then, having identified the control flow elements in our descriptive sentence, start each one on its own line and indent and nest the statements as needed.
FUNCTION: is_TRUE
INPUTS: x, a vector.
OUTPUT: a logical vector whose values are only TRUE or FALSE.
FOR each element in x
    IF the element in x is identical to TRUE
        SET the corresponding element of the output vector to TRUE
    ELSE
        SET the corresponding element of the output vector to FALSE
RETURN the output vector
At this point, we have pseudocode*.
*We have a long way to go before we rest though. While many of you could, no doubt, successfully code a working function from this point, the goal is to make it foolproof. This would generally be less than what we are looking for from you. Let’s move on.
Making our pseudocode complete.
the element in x
This is a little ambiguous, so we can start out by naming this element, e.
And,
the corresponding element of the output vector
Let’s name this output vector y.
FUNCTION: is_TRUE
INPUTS: x, a vector.
OUTPUT: y, a logical vector whose values are only TRUE or FALSE.
y <- a logical vector the same length as x
FOR each element, e, in x
    IF e is identical to TRUE
        SET the corresponding element of y to TRUE
    ELSE
        SET the corresponding element of y to FALSE
RETURN y
Nice! That’s a bit more clear and should be complete. We could generate useful code from this, but what else could be clarified or made more precise?
identical to TRUE
What does this mean, exactly?
The element must equal TRUE. But 1 == TRUE too, so the element must also be a logical value.
IF will throw an error if it cannot make a TRUE or FALSE determination, so we need the value to not be unknown.
FUNCTION: is_TRUE
INPUTS: x, a vector.
OUTPUT: a logical vector whose values are only TRUE or FALSE.
y <- a logical vector the same length as x
FOR each element, e, in x
    IF e = TRUE AND e is a logical value AND e is not unknown
        SET the corresponding element of y to TRUE
    ELSE
        SET the corresponding element of y to FALSE
RETURN y
Alright, this is looking much more clear and feels complete and, if you wanted to, you could go ahead and code this now. Though, this is not concise enough to give us great results.
The implementation I’ve provided here is a direct translation. You can write a vectorized version of this pseudocode, but it’s better to eliminate the extraneous parts of the pseudocode before you start thinking about vectorization.
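A direct translation along these lines might look like the sketch below. It assumes x is an atomic vector (not a list), and it maps "is a logical value" and "is not unknown" onto R's is.logical() and is.na(); those mappings are assumptions of this sketch.

```r
# Sketch of a direct, line-by-line translation of the pseudocode.
# Assumes x is an atomic vector.
is_TRUE <- function(x) {
  # y <- a logical vector the same length as x (initialised to FALSE)
  y <- logical(length(x))

  # FOR each element, e, in x
  for (i in seq_along(x)) {
    e <- x[i]
    # IF e = TRUE AND e is a logical value AND e is not unknown
    if (e == TRUE && is.logical(e) && !is.na(e)) {
      y[i] <- TRUE
    } else {
      y[i] <- FALSE
    }
  }

  # RETURN y
  y
}
```

Calling it on c(TRUE, FALSE, NA) and on c(-1, 0, 1) exercises all three parts of the condition.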
Now, let’s check how we did…
[1] TRUE FALSE NA
[1] TRUE FALSE FALSE
[1] -1 0 1
[1] FALSE FALSE FALSE
Making our pseudocode more concise.
Take a look at the current state of our pseudocode:
FUNCTION: is_TRUE
INPUTS: x, a vector.
OUTPUT: a logical vector whose values are only TRUE or FALSE.
y <- a logical vector the same length as x
FOR each element, e, in x
    IF e = TRUE AND e is a logical value AND e is not unknown
        SET the corresponding element of y to TRUE
    ELSE
        SET the corresponding element of y to FALSE
RETURN y
Do you notice anything about our conditional?
When you set an object to TRUE or FALSE based on the condition in an IF statement, you can generally eliminate the IF and just set the object’s value based on the condition. See the simplified example code below:
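A minimal sketch of such an example, assuming a is a logical value already known to be TRUE or FALSE. The names a and b are the ones used in the discussion that follows.

```r
a <- TRUE

# Set b based on the condition tested in the IF statement.
if (a == TRUE) {
  b <- TRUE
} else {
  b <- FALSE
}

b
```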
Do you see how, if a == TRUE then b == TRUE, and if a == FALSE then b == FALSE? Wouldn’t it be simpler to just set b <- a?
Also, consider this code:
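A minimal sketch of the kind of code in question, again assuming a is a logical value. The variable name result and its strings are hypothetical, used only to give the branches something to do.

```r
a <- TRUE

# Branch on an explicit comparison with TRUE.
if (a == TRUE) {
  result <- "a is TRUE"
} else {
  result <- "a is not TRUE"
}

result
```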
Do you notice the == TRUE part is redundant?
I claim that a == TRUE is logically equivalent to a whenever a is a logical value. Think it through until you have convinced yourself of this fact.
Now, let’s implement these changes in our pseudocode.
FUNCTION: is_TRUE
INPUTS: x, a vector.
OUTPUT: a logical vector whose values are only TRUE or FALSE.
y <- a logical vector the same length as x
FOR each element, e, in x
    corresponding element of y <- e = TRUE AND e is a logical value AND e is not unknown
RETURN y
Now this is something which we can easily and cleanly implement both directly and as a vectorized R solution.
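One way the direct (loop) implementation of this concise pseudocode might look in R. This is a sketch that assumes x is an atomic vector and uses is.logical() and is.na() for the "logical value" and "not unknown" checks.

```r
# Sketch: the IF/ELSE is gone; the condition itself is stored in y.
is_TRUE <- function(x) {
  y <- logical(length(x))
  for (i in seq_along(x)) {
    e <- x[i]
    # An NA from e == TRUE is turned into FALSE by the final !is.na(e),
    # because NA && FALSE evaluates to FALSE in R.
    y[i] <- e == TRUE && is.logical(e) && !is.na(e)
  }
  y
}

is_TRUE(c(TRUE, FALSE, NA))  # TRUE FALSE FALSE
is_TRUE(c(-1, 0, 1))         # FALSE FALSE FALSE
```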
Bonus pseudocode!
While we’ve said pseudocode is language agnostic (and that is, by and large, true), if you program extensively or exclusively in one language you’ll be okay if, in the pseudocode you write for yourself, you include language-specific ideas or capabilities. The pseudocode police won’t be busting down your door if you mention certain functions you intend to use or if you plan your vectorization out prior to starting to code. So here is a possible pseudocode example for this problem, tailored specifically for R.
We can generally vectorize a for loop by removing the looping structure and addressing the objects as wholes rather than their elements individually.
Vectorization frees us from needing to know the size of the vector we are working with, so there is no need for a length variable such as n. We also do not need to pre-allocate our return vector, and since we would now only be accessing y once, when we create it, there is no need to store it as a variable only to then immediately return it.
FUNCTION: is_TRUE
INPUTS: x, a vector.
OUTPUT: a logical vector whose values are only TRUE or FALSE.
RETURN the x values AND x is a logical vector AND the values of x are not unknown
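A vectorized R function following this pseudocode might be sketched as below. It assumes x is a logical or numeric vector; a character vector would make the `&` operator raise an error, so handling that case is left out of this sketch.

```r
# Sketch of a vectorized version: no loop, no pre-allocated y.
# Assumes x is a logical or numeric vector.
is_TRUE <- function(x) {
  # is.logical(x) is a single TRUE/FALSE that is recycled across x;
  # !is.na(x) turns any NA positions into FALSE in the result.
  x & is.logical(x) & !is.na(x)
}

is_TRUE(c(TRUE, FALSE, NA))  # TRUE FALSE FALSE
is_TRUE(c(-1, 0, 1))         # FALSE FALSE FALSE
```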
[1] TRUE FALSE NA
[1] TRUE FALSE FALSE
[1] -1 0 1
[1] FALSE FALSE FALSE
You’ll notice that our pseudocode avoided any mention of indices in our loop. This is by design. It’s not wrong to use indices of iteration in your pseudocode, but it can lead to mistakes based on erroneous assumptions and language-specific implementation. For instance, some languages (Python, C++) are “zero-indexed”, meaning the first element of a vector is element 0, so pseudocode stating that your for loop iterates from 1 to n will likely cause issues, especially if implemented directly as written. Likewise, a Python or C++ programmer who pseudocodes something where the intended range of iteration is over all of the elements after the first may indicate something like:
n <- number of books on the shelf
for i in 1 to n - 1
In R, which is one-indexed, this would process everything but the final element or, if you weren’t careful in your implementation, might result in a for loop where i takes the values 1 and 0 when n is 1.
For that reason it is almost always preferable to avoid specifying indices and instead say things like
for each book after the first
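The pitfall described above can be seen directly in R. The sketch below uses a hypothetical one-book shelf (the vector books and the counter visited are illustrative names) to compare an index range written as 1 to n - 1 with iterating over "each book after the first".

```r
books <- c("Only Book")   # hypothetical shelf holding a single book
n <- length(books)

# Literal translation of "for i in 1 to n - 1" when n is 1:
bad_indices <- 1:(n - 1)        # 1:0 counts down, giving 1 0
safe_indices <- seq_len(n - 1)  # an empty sequence, so no iterations

# "for each book after the first" needs no index arithmetic at all.
visited <- 0
for (book in books[-1]) {
  visited <- visited + 1
}

bad_indices   # 1 0
visited       # 0
```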
While the pseudocode from Pass 3 is what we are looking for from you, if you were working with a much larger function or wanted to do an additional pass prior to coding your function, you could do something like the following. Notice we are still trying to remain as language agnostic as possible by not specifying the index boundaries of x.
You’ll notice this starts to look very code-like though and borders on not being “pseudo”-code anymore.
FUNCTION: is_TRUE
INPUTS: x, a vector.
OUTPUT: a logical vector whose values are only TRUE or FALSE.
y <- a logical vector the same length as x
FOR i in the indices of x
    y[i] <- x[i] AND x[i] is a logical value AND x[i] is not unknown
RETURN y
Writing good pseudocode is not an arcane art, but it does take practice to become comfortable with it and good at it. In time you’ll write more complete and concise pseudocode more easily and earlier, and you will need fewer revisions to get your pseudocode to a place where it is ready to be translated into code.